On Bidirectional English-Arabic Search

Authors

  • M. Aljlayl
  • O. Frieder
  • D. Grossman
Abstract

In Cross-Language Information Retrieval (CLIR), queries in one language retrieve relevant documents in other languages. Machine-Readable Dictionaries (MRDs) and Machine Translation (MT) systems are important resources for query translation in CLIR. We investigate the use of MT systems and MRDs for Arabic-English and English-Arabic CLIR. The key problem is the translation ambiguity associated with these resources. We present three methods of query translation using a bilingual dictionary for Arabic-English CLIR. First, we present the Every-Match (EM) method. This method yields ambiguous translations, since many extraneous terms are added to the original query. To disambiguate the query translation, we present the First-Match (FM) method, which takes the first match in the dictionary as the candidate term. Finally, we present the Two-Phase (TP) method. We show that, with the Two-Phase method, good retrieval effectiveness can be achieved for Arabic-English CLIR without complex resources. We also empirically evaluate the effectiveness of the Arabic-English MT approach using short, medium, and long queries from the TREC7 and TREC9 topics and collections, and we investigate the effect of query length on the quality of MT-based CLIR. English-Arabic CLIR is evaluated via an MRD and English-Arabic MT. Post-translation query expansion is used to de-emphasize the extraneous terms introduced by the MRD and MT for English-Arabic CLIR.
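As a rough illustration of the two dictionary-based methods the abstract describes in enough detail to sketch, the following Python snippet contrasts Every-Match and First-Match on a toy dictionary. Everything here is an assumption made for illustration: the function names, the transliterated toy entries, and the choice to pass out-of-dictionary terms through unchanged are not taken from the paper. The Two-Phase method is left out because the abstract does not say how it works.

```python
from typing import Dict, List

# Hypothetical toy bilingual dictionary (transliterated source term ->
# candidate English translations, in dictionary order). Illustrative only.
TOY_DICT: Dict[str, List[str]] = {
    "kitab": ["book", "writing"],
    "maktab": ["office", "desk", "bureau"],
}


def every_match(query: List[str], dictionary: Dict[str, List[str]]) -> List[str]:
    """Every-Match sketch: keep every dictionary translation of each term.

    This tends to add extraneous terms to the translated query.
    Assumption: terms missing from the dictionary are kept as-is.
    """
    translated: List[str] = []
    for term in query:
        translated.extend(dictionary.get(term, [term]))
    return translated


def first_match(query: List[str], dictionary: Dict[str, List[str]]) -> List[str]:
    """First-Match sketch: keep only the first dictionary translation.

    Assumption: terms missing from the dictionary are kept as-is.
    """
    translated: List[str] = []
    for term in query:
        candidates = dictionary.get(term)
        translated.append(candidates[0] if candidates else term)
    return translated


if __name__ == "__main__":
    query = ["kitab", "maktab"]
    print(every_match(query, TOY_DICT))
    print(first_match(query, TOY_DICT))
```

On this toy input, Every-Match returns five target terms for a two-term query while First-Match returns two, which mirrors the ambiguity trade-off the abstract points to.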


Similar resources

Towards Supporting Exploratory Search over the Arabic Web Content: The Case of ArabXplore

Due to the huge amount of data published on the Web, the Web search process has become more difficult, and it is sometimes hard to get the expected results, especially when the users are less certain about their information needs. Several efforts have been proposed to support exploratory search on the web by using query expansion, faceted search, or supplementary information extracted from exte...


QArabPro: A Rule Based Question Answering System for Reading Comprehension Tests in Arabic

Problem statement: Extensive research efforts in the area of Natural Language Processing (NLP) were focused on developing reading comprehension Question Answering (QA) systems for Latin-based languages such as English, French and German. Approach: However, little effort was directed towards the development of such systems for bidirectional languages such as Arabic, Urdu and Farsi. In general, ...


The Reality of Arabic Fiction Translation into English: A Sociological Approach

English translations of texts associated with Arabic fiction remain largely unexplored from a sociological perspective. Drawing on Pierre Bourdieu’s sociology, this paper aims to examine the genesis of Arabic fiction translation into English as a socially situated activity. Works of Arabic fiction emerged in English translation in the early twentieth century. Since then, this intellectual field...


Translation Modeling with Bidirectional Recurrent Neural Networks

This work presents two different translation models using recurrent neural networks. The first one is a word-based approach using word alignments. Second, we present phrase-based translation models that are more consistent with phrase-based decoding. Moreover, we introduce bidirectional recurrent neural models to the problem of machine translation, allowing us to use the full source sentence in ...


Recover Writing Trajectory from Multiple Stroked Image

The recovery of a writing trajectory from an offline handwritten image is generally regarded as a difficult problem [1]. This paper introduces a method to recover the writing trajectory from multiple stroked images by searching for the best-matching writing paths of template strokes. The search procedure is guided by a matching cost function, which is defined as the summation of positional distortion ...




Publication date: 2002